Vector Autoregression

Week 12

Alex Cardazzi

Old Dominion University

Topics

  • Vector Autoregression

Autoregression

\[Y_t = f(Y_{t-1}, ..., Y_{t-p})\]

  • AR(p) models suggest the history of some variable will teach us about its future.
  • We use adf.test() to test for stationarity.
  • We use acf() and pacf() to examine the lag structure.
  • auto.arima is our friend.
  • But what about other factors?

Distributed Lag Models

\[\begin{align}Y_t = f(&Y_{t-1}, ..., Y_{t-p},\\ &X_{t-1}, ..., X_{t-q})\end{align}\]

  • ARDL(p, q) models suggest the history of another variable will teach us about our variable’s future.
  • We still need stationarity and a lag structure.
  • To forecast with this, we often need multiple equations/models.

Vector Autoregression

Chris Sims’s Macroeconomics and Reality (1980) was hugely influential in empirical macroeconomics.

  • Previously, most people would use ARDL type models.
  • The issue with these models is that they do not allow for “feedback”.

Vector Autoregression

Motivation:

\[Y_{t} = \alpha_{11} Y_{t-1} + \beta_{11} X_{t-1} + e_{1t}\]

  • \(X\) “causes” \(Y\), but \(Y\) cannot influence \(X\).

Vector Autoregression

Solution:

\[Y_{t} = \alpha_{1} Y_{t-1} + \beta_{1} X_{t-1} + e_{1t}\] \[X_{t} = \alpha_{2} Y_{t-1} + \beta_{2} X_{t-1} + e_{2t}\]

Now we can model \(X\) and \(Y\) as functions of past values of one another.

This will allow for this “feedback”.

Vector Autoregression

Of course, errors to this system might be correlated. Therefore, we must decompose them into independent shocks.

\[\begin{align} e_{1t} &= u_{1t} \\ e_{2t} &= \rho e_{1t} + u_{2t} \\ &= \rho u_{1t} + u_{2t} \end{align}\]

Vector Autoregression

Solution:

\[\begin{align}Y_{t} &= \alpha_{1} Y_{t-1} + \beta_{1} X_{t-1} + u_{1t}\\ X_{t} &= \alpha_{2} Y_{t-1} + \beta_{2} X_{t-1} + \rho u_{1t} + u_{2t}\end{align}\]

Important: this ordering has implications on your model. Here, shocks to \(Y\) impact both \(Y\) and \(X\) at the same time. However, a shock to \(X\) only gets to \(Y\) next period through the lagged \(X\).

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

VAR() is the main function

  • y: the dataset (a ts object)
  • p: number of lags
  • lag.max: determines the highest lag when auto-selecting
  • type: “none”, “const”, “trend”, “both”
  • season: integer telling the frequency
  • exogen: exogenous variables (no feedback)

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

restrict() is a function to estimate restricted VARs

  • x: a fitted object from VAR()
  • method: “ser” (automatic based on significance) or “manual”

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

VARselect() is a function to help select lag length

  • Use the same things as in VAR().

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

irf() is a function to help you interpret VAR output. There are many coefficients that are very difficult to interpret, so irf() will guide your interpretation.

Vector Autoregression

How do we estimate these models in R?

Use the package vars (reference manual)

SVAR() is a function to estimate a structural VAR. This is ultimately beyond the scope of the course, but used in more “serious” macroeconomics.

VAR Simulation

Code
N <- 200
y <- rep(0, N)
x <- rep(0, N)
shocksy <- rnorm(N, sd = .2)
shocksx <- .5*shocksy + rnorm(N, sd = .2)
for(i in 3:N){
  
  y[i] <- .4*y[i-1] - .2*y[i-2] - .3*x[i-1] + .6*x[i-2] + shocksy[i]
  x[i] <- .5*y[i-1] - .5*y[i-2] + .5*x[i-1] + .4*x[i-2] + shocksx[i]
}

sim <- ts(cbind(y, x), start = 1, frequency = 1)
head(sim)
Time Series:
Start = 1 
End = 6 
Frequency = 1 
             y           x
1  0.000000000  0.00000000
2  0.000000000  0.00000000
3 -0.055596983  0.08129699
4 -0.070557695  0.03195294
5 -0.002703492 -0.10851724
6  0.118393831  0.10223881

VAR Simulation

Code
par(mfrow = c(1, 2))
plot(y, type = "l"); plot(x, type = "l")

VAR Simulation

Code
tseries::adf.test(sim[,1])
tseries::adf.test(sim[,2])

    Augmented Dickey-Fuller Test

data:  sim[, 1]
Dickey-Fuller = -4.6124, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary


    Augmented Dickey-Fuller Test

data:  sim[, 2]
Dickey-Fuller = -3.8952, Lag order = 5, p-value = 0.01553
alternative hypothesis: stationary

VAR Simulation

Code
acf(sim)

Code
pacf(sim)

VAR Simulation

Code
lagz <- VARselect(sim)
lagz$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     2      2      2      2 

VAR Simulation

Code
# y[i] <- .4*y[i-1] - .2*y[i-2] - .3*x[i-1] + .6*x[i-2] + shocksy[i]
# x[i] <- .5*y[i-1] - .5*y[i-2] + .5*x[i-1] + .4*x[i-2] + shocksx[i]
reg <- VAR(sim, p = 2, season = NULL, type = "none", exogen = NULL); reg

VAR Estimation Results:
======================= 

Estimated coefficients for equation y: 
====================================== 
Call:
y = y.l1 + x.l1 + y.l2 + x.l2 

      y.l1       x.l1       y.l2       x.l2 
 0.4222966 -0.3021403 -0.2895598  0.5779540 


Estimated coefficients for equation x: 
====================================== 
Call:
x = y.l1 + x.l1 + y.l2 + x.l2 

      y.l1       x.l1       y.l2       x.l2 
 0.4275707  0.4589329 -0.3075500  0.3607856 

VAR Simulation

Code
impulse <- irf(reg, impulse = "y", response = "y")
plot(impulse)

Code
impulse <- irf(reg, impulse = "y", response = "x")
plot(impulse)

Code
impulse <- irf(reg, impulse = "x", response = "y")
plot(impulse)

Code
impulse <- irf(reg, impulse = "x", response = "x")
plot(impulse)

VAR Simulation

Variance Decomposition: % of variance due to the other variable

Code
plot(fevd(reg))

VAR Simulation

Code
pred <- predict(reg, n.ahead = 20)
plot(y, type = "l", xlim = c(1, N + 20))
lines((N+1):(N+20), pred$fcst$y[,1], col = "tomato")

Code
pred <- predict(reg, n.ahead = 20)
plot(x, type = "l", xlim = c(1, N + 20))
lines((N+1):(N+20), pred$fcst$x[,1], col = "tomato")
lines((N+1):(N+20), pred$fcst$x[,2], col = "tomato", lty = 2)
lines((N+1):(N+20), pred$fcst$x[,3], col = "tomato", lty = 2)

Code
pred <- predict(reg, n.ahead = 20)
fanchart(pred, names = "x")

Los Angeles Air Quality

LA Air Quality

There is an R package astsa (Applied Statistical Time Series Analysis) that accompanies two textbooks (Time Series Analysis and Its Applications: With R Examples; Time Series: A Data Analysis Approach using R). We are going to use some data from this package.

These data are at the weekly level from Los Angeles County from 1970 - 1979

  • Average weekly cardiovascular mortality
  • Temperatures
  • Air Quality (pollution particulates)
Code
# Average weekly cardiovascular mortality in Los Angeles County
# 508 six-day smoothed averages: 1970-1979
cmort <- astsa::cmort

# Temperatures from LA pollution study
tempr <- astsa::tempr

# Air Quality (particulates) from LA pollution study
part <- astsa::part

LA Air Quality

Code
plot(cmort)

Code
plot(tempr)

Code
plot(part)

LA Air Quality

Code
tseries::adf.test(cmort)

    Augmented Dickey-Fuller Test

data:  cmort
Dickey-Fuller = -5.4125, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary
Code
tseries::adf.test(tempr)

    Augmented Dickey-Fuller Test

data:  tempr
Dickey-Fuller = -4.4572, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary
Code
tseries::adf.test(part)

    Augmented Dickey-Fuller Test

data:  part
Dickey-Fuller = -4.493, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary

LA Air Quality

Code
la <- ts.union(cmort = cmort,
               tempr = tempr,
               part = part)

Order Assumptions:

  • A shock to mortality impacts mortality, temperature, and particles
  • A shock to temperature impacts temperature and particles
  • A shock to particles impacts particles

LA Air Quality

Code
la <- ts.union(part = part,
               tempr = tempr,
               cmort = cmort)

Order Assumptions:

  • A shock to particles impacts particles, temperature, and mortality
  • A shock to temperature impacts temperature and mortality
  • A shock to mortality impacts mortality

LA Air Quality

Code
acf(la, lag.max = 52)

Code
pacf(la, lag.max = 52)

LA Air Quality

Code
lagz <- VARselect(la,          # dataset
                  lag.max = 26,# half-year lags as a max
                  season = 52) # controlling for seasonality via week FEs
lagz$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     5      2      2      5 

LA Air Quality

Code
reg <- VAR(la, p=5, type="both", season = 52)
summary(reg)

VAR Estimation Results:
========================= 
Endogenous variables: part, tempr, cmort 
Deterministic variables: both 
Sample size: 503 
Log Likelihood: -4687.423 
Roots of the characteristic polynomial:
0.8598 0.7655 0.6828 0.6828 0.663 0.663 0.6413 0.6281 0.6281 0.6194 0.6194 0.5488 0.5488 0.5222 0.5222
Call:
VAR(y = la, p = 5, type = "both", season = 52L)


Estimation results for equation part: 
===================================== 
part = part.l1 + tempr.l1 + cmort.l1 + part.l2 + tempr.l2 + cmort.l2 + part.l3 + tempr.l3 + cmort.l3 + part.l4 + tempr.l4 + cmort.l4 + part.l5 + tempr.l5 + cmort.l5 + const + trend + sd1 + sd2 + sd3 + sd4 + sd5 + sd6 + sd7 + sd8 + sd9 + sd10 + sd11 + sd12 + sd13 + sd14 + sd15 + sd16 + sd17 + sd18 + sd19 + sd20 + sd21 + sd22 + sd23 + sd24 + sd25 + sd26 + sd27 + sd28 + sd29 + sd30 + sd31 + sd32 + sd33 + sd34 + sd35 + sd36 + sd37 + sd38 + sd39 + sd40 + sd41 + sd42 + sd43 + sd44 + sd45 + sd46 + sd47 + sd48 + sd49 + sd50 + sd51 

          Estimate Std. Error t value Pr(>|t|)    
part.l1   0.049295   0.065131   0.757 0.449538    
tempr.l1 -0.208162   0.113626  -1.832 0.067637 .  
cmort.l1  0.123986   0.100487   1.234 0.217923    
part.l2   0.144762   0.063992   2.262 0.024179 *  
tempr.l2 -0.001549   0.114346  -0.014 0.989198    
cmort.l2 -0.102222   0.105310  -0.971 0.332252    
part.l3   0.179490   0.064698   2.774 0.005770 ** 
tempr.l3 -0.265257   0.115915  -2.288 0.022594 *  
cmort.l3  0.024809   0.110437   0.225 0.822361    
part.l4   0.239045   0.064091   3.730 0.000217 ***
tempr.l4 -0.312321   0.114887  -2.719 0.006820 ** 
cmort.l4 -0.035780   0.104821  -0.341 0.733011    
part.l5   0.023776   0.064868   0.367 0.714149    
tempr.l5  0.085251   0.115150   0.740 0.459489    
cmort.l5 -0.147366   0.099896  -1.475 0.140883    
const    82.790530  17.824775   4.645 4.52e-06 ***
trend    -0.005258   0.004783  -1.099 0.272252    
sd1       2.362927   4.876084   0.485 0.628207    
sd2      -2.306754   4.870991  -0.474 0.636044    
sd3      -3.060327   4.991713  -0.613 0.540143    
sd4      -5.864355   4.950999  -1.184 0.236870    
sd5       1.915231   5.015943   0.382 0.702775    
sd6      -4.620982   4.986433  -0.927 0.354591    
sd7      -7.212084   5.121048  -1.408 0.159750    
sd8      -3.713790   5.300526  -0.701 0.483899    
sd9      -5.748112   5.339170  -1.077 0.282259    
sd10     -5.039454   5.425063  -0.929 0.353445    
sd11      2.470759   5.572720   0.443 0.657721    
sd12     -3.064717   5.734745  -0.534 0.593330    
sd13      2.305934   5.780915   0.399 0.690172    
sd14     -2.821275   5.814388  -0.485 0.627762    
sd15     -0.436539   5.937560  -0.074 0.941425    
sd16     -3.368925   6.053437  -0.557 0.578134    
sd17      0.404534   6.225082   0.065 0.948216    
sd18      5.816921   6.293288   0.924 0.355840    
sd19      5.508341   6.539902   0.842 0.400102    
sd20     11.960828   6.575750   1.819 0.069610 .  
sd21      6.443608   6.734657   0.957 0.339208    
sd22      9.174171   6.825427   1.344 0.179611    
sd23      6.979634   6.952467   1.004 0.315981    
sd24     16.166939   7.027334   2.301 0.021888 *  
sd25     13.393565   7.063550   1.896 0.058602 .  
sd26      8.787142   7.145392   1.230 0.219450    
sd27      9.982003   7.149967   1.396 0.163399    
sd28      9.154440   7.164670   1.278 0.202030    
sd29     11.900858   7.159697   1.662 0.097193 .  
sd30     12.140906   7.068455   1.718 0.086578 .  
sd31     12.455307   7.074839   1.761 0.079024 .  
sd32     14.969761   6.931012   2.160 0.031332 *  
sd33     14.121541   6.885372   2.051 0.040871 *  
sd34     15.636215   6.831606   2.289 0.022569 *  
sd35     18.742345   6.692165   2.801 0.005327 ** 
sd36     22.713167   6.468197   3.512 0.000492 ***
sd37     12.432550   6.363695   1.954 0.051381 .  
sd38     20.203314   6.243914   3.236 0.001306 ** 
sd39     17.607029   5.920029   2.974 0.003102 ** 
sd40     14.750642   5.781377   2.551 0.011070 *  
sd41     13.700734   5.637624   2.430 0.015493 *  
sd42     10.356809   5.565891   1.861 0.063452 .  
sd43     17.584821   5.310093   3.312 0.001005 ** 
sd44     10.363581   5.402208   1.918 0.055715 .  
sd45     11.652398   5.159027   2.259 0.024400 *  
sd46      6.454610   5.179864   1.246 0.213400    
sd47      5.557025   5.208122   1.067 0.286567    
sd48      3.248403   5.172188   0.628 0.530299    
sd49      7.915153   5.226536   1.514 0.130646    
sd50      4.050254   5.123964   0.790 0.429694    
sd51      6.244605   4.926960   1.267 0.205678    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 9.951 on 435 degrees of freedom
Multiple R-Squared: 0.6256, Adjusted R-squared: 0.568 
F-statistic: 10.85 on 67 and 435 DF,  p-value: < 2.2e-16 


Estimation results for equation tempr: 
====================================== 
tempr = part.l1 + tempr.l1 + cmort.l1 + part.l2 + tempr.l2 + cmort.l2 + part.l3 + tempr.l3 + cmort.l3 + part.l4 + tempr.l4 + cmort.l4 + part.l5 + tempr.l5 + cmort.l5 + const + trend + sd1 + sd2 + sd3 + sd4 + sd5 + sd6 + sd7 + sd8 + sd9 + sd10 + sd11 + sd12 + sd13 + sd14 + sd15 + sd16 + sd17 + sd18 + sd19 + sd20 + sd21 + sd22 + sd23 + sd24 + sd25 + sd26 + sd27 + sd28 + sd29 + sd30 + sd31 + sd32 + sd33 + sd34 + sd35 + sd36 + sd37 + sd38 + sd39 + sd40 + sd41 + sd42 + sd43 + sd44 + sd45 + sd46 + sd47 + sd48 + sd49 + sd50 + sd51 

          Estimate Std. Error t value Pr(>|t|)    
part.l1  -0.001671   0.037279  -0.045 0.964260    
tempr.l1  0.041465   0.065037   0.638 0.524090    
cmort.l1 -0.055240   0.057516  -0.960 0.337373    
part.l2  -0.033307   0.036628  -0.909 0.363676    
tempr.l2  0.162687   0.065449   2.486 0.013304 *  
cmort.l2 -0.035054   0.060277  -0.582 0.561168    
part.l3   0.041131   0.037031   1.111 0.267306    
tempr.l3 -0.093019   0.066347  -1.402 0.161623    
cmort.l3  0.051724   0.063211   0.818 0.413647    
part.l4   0.072928   0.036684   1.988 0.047437 *  
tempr.l4  0.013721   0.065758   0.209 0.834810    
cmort.l4  0.020643   0.059997   0.344 0.730961    
part.l5  -0.063262   0.037129  -1.704 0.089124 .  
tempr.l5  0.101658   0.065909   1.542 0.123703    
cmort.l5 -0.101582   0.057178  -1.777 0.076333 .  
const    67.569622  10.202446   6.623 1.04e-10 ***
trend    -0.001629   0.002738  -0.595 0.552077    
sd1      -1.095339   2.790946  -0.392 0.694909    
sd2      -0.252820   2.788031  -0.091 0.927788    
sd3      -0.554623   2.857129  -0.194 0.846173    
sd4      -4.655073   2.833825  -1.643 0.101171    
sd5       0.503427   2.870998   0.175 0.860887    
sd6      -0.869841   2.854107  -0.305 0.760689    
sd7       0.990936   2.931157   0.338 0.735474    
sd8       1.071027   3.033885   0.353 0.724243    
sd9      -2.118620   3.056004  -0.693 0.488513    
sd10     -0.276556   3.105167  -0.089 0.929073    
sd11      3.948477   3.189683   1.238 0.216424    
sd12      0.622946   3.282422   0.190 0.849568    
sd13      3.535714   3.308848   1.069 0.285859    
sd14      3.236139   3.328007   0.972 0.331394    
sd15      4.730365   3.398508   1.392 0.164665    
sd16      4.302211   3.464833   1.242 0.215024    
sd17      4.348371   3.563078   1.220 0.222975    
sd18      8.533858   3.602117   2.369 0.018266 *  
sd19      9.927234   3.743273   2.652 0.008294 ** 
sd20     12.842438   3.763791   3.412 0.000705 ***
sd21      9.868539   3.854746   2.560 0.010801 *  
sd22     10.639028   3.906700   2.723 0.006724 ** 
sd23      9.644718   3.979414   2.424 0.015772 *  
sd24     13.108459   4.022266   3.259 0.001206 ** 
sd25     12.884889   4.042996   3.187 0.001541 ** 
sd26     10.590151   4.089840   2.589 0.009937 ** 
sd27      8.979146   4.092458   2.194 0.028758 *  
sd28      8.316349   4.100874   2.028 0.043175 *  
sd29      9.496886   4.098027   2.317 0.020944 *  
sd30     10.888702   4.045803   2.691 0.007391 ** 
sd31      6.914916   4.049457   1.708 0.088421 .  
sd32      8.385632   3.967134   2.114 0.035103 *  
sd33     10.177830   3.941011   2.583 0.010133 *  
sd34      7.287770   3.910237   1.864 0.063028 .  
sd35      5.038358   3.830424   1.315 0.189084    
sd36      4.984268   3.702231   1.346 0.178911    
sd37      1.531341   3.642416   0.420 0.674387    
sd38      1.457768   3.573857   0.408 0.683549    
sd39      3.477229   3.388473   1.026 0.305371    
sd40     -1.924273   3.309112  -0.582 0.561200    
sd41     -1.304773   3.226832  -0.404 0.686153    
sd42     -1.972773   3.185774  -0.619 0.536079    
sd43     -1.775810   3.039361  -0.584 0.559341    
sd44     -4.662356   3.092086  -1.508 0.132323    
sd45     -1.696974   2.952895  -0.575 0.565804    
sd46     -4.672863   2.964822  -1.576 0.115729    
sd47     -9.498172   2.980996  -3.186 0.001545 ** 
sd48     -0.831837   2.960428  -0.281 0.778855    
sd49      0.227368   2.991535   0.076 0.939451    
sd50     -0.917113   2.932826  -0.313 0.754654    
sd51     -2.938001   2.820066  -1.042 0.298074    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 5.696 on 435 degrees of freedom
Multiple R-Squared: 0.6558, Adjusted R-squared: 0.6028 
F-statistic: 12.37 on 67 and 435 DF,  p-value: < 2.2e-16 


Estimation results for equation cmort: 
====================================== 
cmort = part.l1 + tempr.l1 + cmort.l1 + part.l2 + tempr.l2 + cmort.l2 + part.l3 + tempr.l3 + cmort.l3 + part.l4 + tempr.l4 + cmort.l4 + part.l5 + tempr.l5 + cmort.l5 + const + trend + sd1 + sd2 + sd3 + sd4 + sd5 + sd6 + sd7 + sd8 + sd9 + sd10 + sd11 + sd12 + sd13 + sd14 + sd15 + sd16 + sd17 + sd18 + sd19 + sd20 + sd21 + sd22 + sd23 + sd24 + sd25 + sd26 + sd27 + sd28 + sd29 + sd30 + sd31 + sd32 + sd33 + sd34 + sd35 + sd36 + sd37 + sd38 + sd39 + sd40 + sd41 + sd42 + sd43 + sd44 + sd45 + sd46 + sd47 + sd48 + sd49 + sd50 + sd51 

          Estimate Std. Error t value Pr(>|t|)    
part.l1  -0.036619   0.032115  -1.140  0.25482    
tempr.l1 -0.142525   0.056028  -2.544  0.01131 *  
cmort.l1  0.284268   0.049549   5.737 1.81e-08 ***
part.l2  -0.021725   0.031554  -0.688  0.49151    
tempr.l2 -0.059241   0.056383  -1.051  0.29399    
cmort.l2  0.333741   0.051928   6.427 3.42e-10 ***
part.l3  -0.054089   0.031902  -1.695  0.09070 .  
tempr.l3  0.030826   0.057157   0.539  0.58994    
cmort.l3  0.038001   0.054456   0.698  0.48565    
part.l4   0.061244   0.031603   1.938  0.05328 .  
tempr.l4  0.033040   0.056650   0.583  0.56004    
cmort.l4 -0.019117   0.051687  -0.370  0.71166    
part.l5   0.054080   0.031986   1.691  0.09160 .  
tempr.l5 -0.092971   0.056780  -1.637  0.10227    
cmort.l5 -0.041721   0.049258  -0.847  0.39747    
const    55.993042   8.789274   6.371 4.80e-10 ***
trend    -0.011889   0.002358  -5.041 6.81e-07 ***
sd1      -4.582957   2.404363  -1.906  0.05730 .  
sd2      -2.947629   2.401852  -1.227  0.22040    
sd3      -6.578345   2.461380  -2.673  0.00781 ** 
sd4      -4.769697   2.441304  -1.954  0.05137 .  
sd5      -4.018792   2.473327  -1.625  0.10492    
sd6      -6.467114   2.458776  -2.630  0.00884 ** 
sd7      -5.489959   2.525153  -2.174  0.03024 *  
sd8      -4.609526   2.613653  -1.764  0.07850 .  
sd9      -6.693363   2.632708  -2.542  0.01136 *  
sd10     -4.795991   2.675061  -1.793  0.07369 .  
sd11     -6.226300   2.747870  -2.266  0.02395 *  
sd12     -7.650352   2.827763  -2.705  0.00709 ** 
sd13     -4.407427   2.850530  -1.546  0.12279    
sd14     -8.322352   2.867035  -2.903  0.00389 ** 
sd15     -2.348161   2.927770  -0.802  0.42297    
sd16     -7.139528   2.984908  -2.392  0.01719 *  
sd17     -8.374923   3.069545  -2.728  0.00662 ** 
sd18     -7.096146   3.103177  -2.287  0.02269 *  
sd19     -3.315950   3.224781  -1.028  0.30439    
sd20     -0.769079   3.242457  -0.237  0.81262    
sd21     -5.635944   3.320813  -1.697  0.09038 .  
sd22     -5.116042   3.365571  -1.520  0.12921    
sd23     -4.586438   3.428214  -1.338  0.18164    
sd24     -5.490564   3.465130  -1.585  0.11380    
sd25      0.449638   3.482988   0.129  0.89734    
sd26     -4.547571   3.523343  -1.291  0.19749    
sd27     -5.645508   3.525599  -1.601  0.11004    
sd28     -6.081322   3.532850  -1.721  0.08590 .  
sd29     -4.730239   3.530397  -1.340  0.18099    
sd30     -2.384915   3.485407  -0.684  0.49418    
sd31     -3.381577   3.488555  -0.969  0.33292    
sd32     -2.232439   3.417634  -0.653  0.51396    
sd33     -3.286779   3.395130  -0.968  0.33354    
sd34     -1.362172   3.368618  -0.404  0.68614    
sd35     -5.115101   3.299861  -1.550  0.12185    
sd36     -2.747183   3.189423  -0.861  0.38952    
sd37     -2.287542   3.137894  -0.729  0.46639    
sd38      0.843316   3.078831   0.274  0.78429    
sd39     -2.820894   2.919126  -0.966  0.33441    
sd40     -1.032914   2.850757  -0.362  0.71728    
sd41     -3.007494   2.779874  -1.082  0.27990    
sd42     -2.193376   2.744503  -0.799  0.42462    
sd43      0.099277   2.618370   0.038  0.96977    
sd44     -0.991298   2.663792  -0.372  0.70997    
sd45      0.875592   2.543881   0.344  0.73087    
sd46      7.240833   2.554155   2.835  0.00480 ** 
sd47      6.080738   2.568089   2.368  0.01833 *  
sd48     -3.756551   2.550370  -1.473  0.14149    
sd49     -2.657159   2.577169  -1.031  0.30310    
sd50     -2.552333   2.526592  -1.010  0.31297    
sd51     -1.382398   2.429450  -0.569  0.56964    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 4.907 on 435 degrees of freedom
Multiple R-Squared: 0.7913, Adjusted R-squared: 0.7591 
F-statistic: 24.61 on 67 and 435 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
       part tempr cmort
part  99.03 38.46 11.99
tempr 38.46 32.44  7.23
cmort 11.99  7.23 24.08

Correlation matrix of residuals:
        part  tempr  cmort
part  1.0000 0.6786 0.2455
tempr 0.6786 1.0000 0.2587
cmort 0.2455 0.2587 1.0000

LA Air Quality

Plot these in RStudio:

Code
impulse <- irf(reg)
plot(impulse)

Make predictions:

Code
H <- 26
p <- predict(reg, n.ahead = H)

xlimz <- c(min(time(cmort)), max(time(cmort)) + (H/52))
xz <- seq(max(time(cmort)) + (1/52), max(time(cmort)) + (H/52), 1/52)

LA Air Quality

Code
plot(cmort, xlim = xlimz)
lines(xz, p$fcst$cmort[,1], col = "tomato")

Code
plot(tempr, xlim = xlimz)
lines(xz, p$fcst$tempr[,1], col = "tomato")

Code
plot(part, xlim = xlimz)
lines(xz, p$fcst$part[,1], col = "tomato")

AIDS

AIDS Deaths

Code
tmp3 <- readRDS("../data/aids.RDS")

tmp <- aggregate(list(death = tmp3$deaths,
                      diag = tmp3$diag),
                 list(month = tmp3$month),
                 sum, na.rm = TRUE)
ifelse(tmp$death == 0, NA, tmp$death) -> tmp$death
ifelse(tmp$diag == 0, NA, tmp$diag) -> tmp$diag
tmp <- tmp[year(tmp$month) >= 1987,]

lim1 <- tmp$month <= ymd("1994-04-01") & !is.na(tmp$death)

plot_all <- function(months = 0){
  
  plot(tmp$month, tmp$diag, type = "n",
       xlab = "Month", ylab = "Quantity",
       xlim = c(min(tmp$month), ymd("1999-01-01")))
  lines(tmp$month[lim1], tmp$diag[lim1], type = "l", col = "black", lwd = 3)
  lines(tmp$month[lim1] - months(months), tmp$death[lim1], col = "dodgerblue", lwd = 3)
  abline(v = ymd(c("1994-04-15")), lty = c(2))
  legend("topright", bty = "n", col = c("black", "dodgerblue"), lty = 1,
         legend = c("Diagnoses", "Deaths"), lwd = 2)
}

aids_ts <- ts(tmp[lim1,3:2], start = c(min(year(tmp$month[lim1]))), frequency = 12)
reg1 <- auto.arima(aids_ts[,"death"])
H <- 24
p1 <- predict(reg1, n.ahead = H)
plot_all()
lines(tmp$month[!lim1][1:H],
      p1$pred, col = "tomato", lwd = 3)
lines(tmp$month[!lim1][1:H],
      tmp$death[!lim1][1:H], col = "mediumseagreen", lwd = 3)
legend("bottomright", bty = "n", col = c("tomato", "mediumseagreen"), lty = 1,
       legend = c("ARIMA", "Actual"), lwd = 2)

AIDS Deaths

ARIMA is fine so long as nothing about the system changes. There is a clear change in diagnoses, which should change death outcomes.

Let’s use VAR to forecast deaths.

AIDS Deaths

First, let’s select a lag for our model.

Code
lagz <- VARselect(aids_ts, lag.max = 36)
lagz$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
    26     26     26     25 

It seems to want 26 lags.

AIDS Deaths

Code
reg <- VAR(aids_ts, p = 26, type = "both")
summary(reg)

VAR Estimation Results:
========================= 
Endogenous variables: diag, death 
Deterministic variables: both 
Sample size: 62 
Log Likelihood: -601.179 
Roots of the characteristic polynomial:
1.109 1.109 1.026 1.026 1.022 1.022 1.018 1.012 1.012 1.012 1.012  1.01  1.01 1.009 1.009 1.009 1.009 1.008 1.008 1.006 1.006 1.002 1.002 1.001 1.001 0.9937 0.9937 0.9928 0.9928 0.9869 0.9869 0.9861 0.9861 0.9859 0.9859 0.9803 0.9803 0.9787 0.9787 0.9633 0.962 0.962 0.9595 0.9595 0.9547 0.9547 0.9517 0.9517 0.9067 0.9067 0.8595 0.8595
Call:
VAR(y = aids_ts, p = 26, type = "both")


Estimation results for equation diag: 
===================================== 
diag = diag.l1 + death.l1 + diag.l2 + death.l2 + diag.l3 + death.l3 + diag.l4 + death.l4 + diag.l5 + death.l5 + diag.l6 + death.l6 + diag.l7 + death.l7 + diag.l8 + death.l8 + diag.l9 + death.l9 + diag.l10 + death.l10 + diag.l11 + death.l11 + diag.l12 + death.l12 + diag.l13 + death.l13 + diag.l14 + death.l14 + diag.l15 + death.l15 + diag.l16 + death.l16 + diag.l17 + death.l17 + diag.l18 + death.l18 + diag.l19 + death.l19 + diag.l20 + death.l20 + diag.l21 + death.l21 + diag.l22 + death.l22 + diag.l23 + death.l23 + diag.l24 + death.l24 + diag.l25 + death.l25 + diag.l26 + death.l26 + const + trend 

            Estimate Std. Error t value Pr(>|t|)   
diag.l1      0.91459    0.26415   3.462  0.00854 **
death.l1     0.74344    0.75139   0.989  0.35144   
diag.l2     -0.24728    0.34638  -0.714  0.49559   
death.l2    -0.70260    0.74795  -0.939  0.37503   
diag.l3      0.46273    0.33933   1.364  0.20981   
death.l3     1.65881    0.78609   2.110  0.06785 . 
diag.l4     -0.40429    0.35695  -1.133  0.29017   
death.l4    -0.81625    0.75454  -1.082  0.31087   
diag.l5      0.05712    0.34284   0.167  0.87181   
death.l5     2.41450    0.96868   2.493  0.03737 * 
diag.l6     -0.48221    0.34180  -1.411  0.19599   
death.l6    -0.69865    0.96001  -0.728  0.48752   
diag.l7      0.38831    0.34204   1.135  0.28912   
death.l7     0.35698    0.91073   0.392  0.70531   
diag.l8     -0.21909    0.34140  -0.642  0.53899   
death.l8     0.44903    0.72824   0.617  0.55464   
diag.l9      0.11800    0.34561   0.341  0.74158   
death.l9     0.32641    0.74226   0.440  0.67175   
diag.l10    -0.28887    0.33323  -0.867  0.41126   
death.l10    0.06240    0.75167   0.083  0.93588   
diag.l11    -0.06113    0.30769  -0.199  0.84748   
death.l11    0.96197    0.80274   1.198  0.26507   
diag.l12    -0.35203    0.31775  -1.108  0.30010   
death.l12   -0.91879    0.82207  -1.118  0.29615   
diag.l13     0.23901    0.35296   0.677  0.51741   
death.l13   -0.22773    0.87274  -0.261  0.80073   
diag.l14    -0.44152    0.33957  -1.300  0.22973   
death.l14    0.39246    0.89213   0.440  0.67164   
diag.l15     0.29627    0.37644   0.787  0.45394   
death.l15    0.69053    0.80835   0.854  0.41781   
diag.l16    -0.29586    0.41334  -0.716  0.49449   
death.l16   -1.24757    0.78172  -1.596  0.14917   
diag.l17     0.22130    0.52501   0.422  0.68447   
death.l17    1.91633    0.80747   2.373  0.04502 * 
diag.l18    -0.63295    0.52980  -1.195  0.26642   
death.l18   -0.26342    0.87922  -0.300  0.77212   
diag.l19     0.78997    0.43577   1.813  0.10743   
death.l19    0.88485    0.83346   1.062  0.31939   
diag.l20    -1.13676    0.46893  -2.424  0.04158 * 
death.l20    0.30267    0.72426   0.418  0.68701   
diag.l21     0.52582    0.55424   0.949  0.37054   
death.l21    0.13488    0.74667   0.181  0.86114   
diag.l22    -0.91544    0.58754  -1.558  0.15782   
death.l22   -0.40932    0.80453  -0.509  0.62464   
diag.l23     0.68284    0.53018   1.288  0.23377   
death.l23    1.82128    0.79317   2.296  0.05077 . 
diag.l24    -0.82178    0.59270  -1.386  0.20301   
death.l24   -0.18240    0.70077  -0.260  0.80122   
diag.l25     0.03133    0.56522   0.055  0.95715   
death.l25   -0.21098    0.65400  -0.323  0.75527   
diag.l26    -1.35724    0.54885  -2.473  0.03854 * 
death.l26    0.25693    0.67941   0.378  0.71514   
const     -811.01348 1850.88473  -0.438  0.67285   
trend      -90.31747   62.68194  -1.441  0.18759   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 132.6 on 8 degrees of freedom
Multiple R-Squared: 0.9815, Adjusted R-squared: 0.8589 
F-statistic: 8.006 on 53 and 8 DF,  p-value: 0.002053 


Estimation results for equation death: 
====================================== 
death = diag.l1 + death.l1 + diag.l2 + death.l2 + diag.l3 + death.l3 + diag.l4 + death.l4 + diag.l5 + death.l5 + diag.l6 + death.l6 + diag.l7 + death.l7 + diag.l8 + death.l8 + diag.l9 + death.l9 + diag.l10 + death.l10 + diag.l11 + death.l11 + diag.l12 + death.l12 + diag.l13 + death.l13 + diag.l14 + death.l14 + diag.l15 + death.l15 + diag.l16 + death.l16 + diag.l17 + death.l17 + diag.l18 + death.l18 + diag.l19 + death.l19 + diag.l20 + death.l20 + diag.l21 + death.l21 + diag.l22 + death.l22 + diag.l23 + death.l23 + diag.l24 + death.l24 + diag.l25 + death.l25 + diag.l26 + death.l26 + const + trend 

            Estimate Std. Error t value Pr(>|t|)
diag.l1    3.110e-02  1.114e-01   0.279    0.787
death.l1  -2.196e-01  3.169e-01  -0.693    0.508
diag.l2    1.589e-02  1.461e-01   0.109    0.916
death.l2  -1.708e-01  3.154e-01  -0.541    0.603
diag.l3    1.266e-01  1.431e-01   0.884    0.402
death.l3  -2.890e-01  3.315e-01  -0.872    0.409
diag.l4   -1.089e-02  1.505e-01  -0.072    0.944
death.l4  -2.824e-01  3.182e-01  -0.888    0.401
diag.l5   -8.553e-03  1.446e-01  -0.059    0.954
death.l5  -1.864e-01  4.085e-01  -0.456    0.660
diag.l6    1.982e-01  1.441e-01   1.375    0.206
death.l6   2.025e-01  4.049e-01   0.500    0.630
diag.l7    2.507e-02  1.442e-01   0.174    0.866
death.l7  -1.807e-01  3.841e-01  -0.470    0.651
diag.l8   -6.617e-02  1.440e-01  -0.460    0.658
death.l8  -5.283e-02  3.071e-01  -0.172    0.868
diag.l9   -7.170e-03  1.458e-01  -0.049    0.962
death.l9   4.163e-02  3.130e-01   0.133    0.897
diag.l10   5.240e-02  1.405e-01   0.373    0.719
death.l10  1.271e-01  3.170e-01   0.401    0.699
diag.l11   7.924e-02  1.298e-01   0.611    0.558
death.l11 -4.161e-02  3.385e-01  -0.123    0.905
diag.l12  -8.518e-02  1.340e-01  -0.636    0.543
death.l12 -1.529e-01  3.467e-01  -0.441    0.671
diag.l13  -1.909e-01  1.489e-01  -1.282    0.236
death.l13  3.186e-01  3.681e-01   0.866    0.412
diag.l14   6.782e-03  1.432e-01   0.047    0.963
death.l14 -1.034e-01  3.762e-01  -0.275    0.790
diag.l15  -1.363e-01  1.588e-01  -0.859    0.416
death.l15 -5.079e-01  3.409e-01  -1.490    0.175
diag.l16  -2.858e-01  1.743e-01  -1.639    0.140
death.l16 -1.218e-01  3.297e-01  -0.369    0.721
diag.l17   9.481e-02  2.214e-01   0.428    0.680
death.l17 -1.504e-01  3.405e-01  -0.442    0.670
diag.l18   1.445e-01  2.234e-01   0.647    0.536
death.l18  2.284e-02  3.708e-01   0.062    0.952
diag.l19   3.343e-01  1.838e-01   1.819    0.106
death.l19 -2.610e-01  3.515e-01  -0.743    0.479
diag.l20  -9.826e-02  1.978e-01  -0.497    0.633
death.l20 -3.157e-01  3.054e-01  -1.034    0.332
diag.l21   2.435e-01  2.337e-01   1.042    0.328
death.l21  3.914e-01  3.149e-01   1.243    0.249
diag.l22   3.305e-01  2.478e-01   1.334    0.219
death.l22 -1.125e-01  3.393e-01  -0.332    0.749
diag.l23   1.033e-01  2.236e-01   0.462    0.656
death.l23 -4.481e-01  3.345e-01  -1.340    0.217
diag.l24   1.121e-01  2.500e-01   0.448    0.666
death.l24 -1.779e-02  2.955e-01  -0.060    0.953
diag.l25   6.164e-02  2.384e-01   0.259    0.802
death.l25 -5.789e-02  2.758e-01  -0.210    0.839
diag.l26   2.179e-01  2.315e-01   0.941    0.374
death.l26  3.522e-01  2.865e-01   1.229    0.254
const      1.233e+03  7.806e+02   1.579    0.153
trend      4.191e+01  2.643e+01   1.585    0.152


Residual standard error: 55.93 on 8 degrees of freedom
Multiple R-Squared: 0.9965, Adjusted R-squared: 0.9736 
F-statistic: 43.51 on 53 and 8 DF,  p-value: 3.4e-06 



Covariance matrix of residuals:
         diag  death
diag  17585.9 -753.6
death  -753.6 3127.6

Correlation matrix of residuals:
         diag   death
diag   1.0000 -0.1016
death -0.1016  1.0000

AIDS Deaths

Code
p <- predict(reg, n.ahead = H)
plot_all()
lines(tmp$month[!lim1][1:H],
      p$fcst$death[,1], col = "tomato", lwd = 3)
lines(tmp$month[!lim1][1:H],
      tmp$death[!lim1][1:H], col = "mediumseagreen", lwd = 3)

AIDS Deaths

Code
reg <- restrict(reg, "ser")
summary(reg)

VAR Estimation Results:
========================= 
Endogenous variables: diag, death 
Deterministic variables: both 
Sample size: 62 
Log Likelihood: -669.214 
Roots of the characteristic polynomial:
1.069 1.069 1.023 1.021 1.012 1.012  1.01  1.01 1.005 1.005 1.004 1.004 1.003 1.003 1.001 1.001 0.9997 0.9997 0.9963 0.9963 0.9948 0.9948 0.9921 0.9921 0.9814 0.9814 0.9813 0.9813 0.9808 0.9808 0.9775 0.9775 0.9774 0.9774 0.9712 0.9712  0.96  0.96 0.9595 0.9595 0.9444 0.9444 0.9203 0.9203 0.8683 0.8683 0.8399 0.8399 0.8252 0.8252 0.7877 0.7877
Call:
VAR(y = aids_ts, p = 26, type = "both")


Estimation results for equation diag: 
===================================== 
diag = diag.l1 + diag.l3 + death.l3 + death.l4 + death.l5 + diag.l6 + death.l11 + diag.l12 + diag.l14 + death.l16 + death.l17 + diag.l19 + diag.l20 + diag.l23 + diag.l26 + trend 

           Estimate Std. Error t value Pr(>|t|)    
diag.l1     0.64036    0.07016   9.127 6.84e-12 ***
diag.l3     0.42641    0.07982   5.342 2.76e-06 ***
death.l3    0.80525    0.18725   4.300 8.79e-05 ***
death.l4   -1.28455    0.26339  -4.877 1.33e-05 ***
death.l5    1.17081    0.22796   5.136 5.56e-06 ***
diag.l6    -0.18211    0.08867  -2.054  0.04570 *  
death.l11   0.46867    0.20581   2.277  0.02747 *  
diag.l12   -0.23376    0.10175  -2.298  0.02619 *  
diag.l14   -0.25568    0.11681  -2.189  0.03372 *  
death.l16  -1.16771    0.24095  -4.846 1.47e-05 ***
death.l17   1.22198    0.27773   4.400 6.38e-05 ***
diag.l19    0.27045    0.09280   2.914  0.00549 ** 
diag.l20   -0.50602    0.10329  -4.899 1.23e-05 ***
diag.l23    0.49617    0.15581   3.184  0.00260 ** 
diag.l26   -0.32102    0.14091  -2.278  0.02741 *  
trend     -15.29840    5.68692  -2.690  0.00992 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 111.4 on 46 degrees of freedom
Multiple R-Squared: 0.9982, Adjusted R-squared: 0.9976 
F-statistic:  1616 on 16 and 46 DF,  p-value: < 2.2e-16 


Estimation results for equation death: 
====================================== 
death = diag.l3 + diag.l6 + death.l6 + diag.l8 + diag.l13 + death.l13 + diag.l15 + death.l15 + diag.l16 + diag.l17 + diag.l18 + diag.l19 + diag.l20 + diag.l21 + death.l21 + diag.l22 + death.l23 + death.l26 + const 

           Estimate Std. Error t value Pr(>|t|)    
diag.l3     0.09406    0.02721   3.457 0.001242 ** 
diag.l6     0.11725    0.03299   3.554 0.000938 ***
death.l6    0.24161    0.06975   3.464 0.001218 ** 
diag.l8    -0.10485    0.03329  -3.149 0.002976 ** 
diag.l13   -0.15156    0.03180  -4.766 2.17e-05 ***
death.l13   0.24738    0.09600   2.577 0.013487 *  
diag.l15   -0.07933    0.03753  -2.114 0.040362 *  
death.l15  -0.49827    0.08895  -5.601 1.38e-06 ***
diag.l16   -0.17866    0.04351  -4.106 0.000176 ***
diag.l17    0.12655    0.03609   3.507 0.001076 ** 
diag.l18    0.20603    0.04743   4.344 8.37e-05 ***
diag.l19    0.30872    0.03938   7.840 7.99e-10 ***
diag.l20   -0.11776    0.05036  -2.339 0.024080 *  
diag.l21    0.17410    0.04634   3.757 0.000513 ***
death.l21   0.31587    0.09687   3.261 0.002178 ** 
diag.l22    0.18590    0.04450   4.177 0.000142 ***
death.l23  -0.34186    0.10353  -3.302 0.001937 ** 
death.l26   0.42514    0.07872   5.400 2.70e-06 ***
const     215.61912  103.20358   2.089 0.042637 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 35.99 on 43 degrees of freedom
Multiple R-Squared: 0.9998, Adjusted R-squared: 0.9997 
F-statistic: 1.137e+04 on 19 and 43 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
       diag death
diag  71410 -2934
death -2934  6963

Correlation matrix of residuals:
         diag   death
diag   1.0000 -0.1316
death -0.1316  1.0000

AIDS Deaths

Code
p <- predict(reg, n.ahead = H)
plot_all()
lines(tmp$month[!lim1][1:H],
      p$fcst$death[,1], col = "tomato", lwd = 3)
lines(tmp$month[!lim1][1:H],
      tmp$death[!lim1][1:H], col = "mediumseagreen", lwd = 3)

AIDS Deaths

Code
mape <- function(a, f) mean(abs((a-f)/a))

cbind(VAR = round(mape(tmp$death[!lim1][1:H], p$fcst$death[,1]), 4)*100, # VAR
      ARIMA = round(mape(tmp$death[!lim1][1:H], p1$pred), 4)*100, # ARIMA
      Average = round(mape(tmp$death[!lim1][1:H], (p1$pred + p$fcst$death[,1])/2), 4)*100) # VAR
      VAR ARIMA Average
[1,] 6.89  8.39    3.96

AIDS Deaths

Code
plot_all()
lines(tmp$month[!lim1][1:H],
      (p1$pred + p$fcst$death[,1])/2,
      col = alpha("tomato", 1), lwd = 3)
lines(tmp$month[!lim1][1:H],
      (p1$pred),
      col = alpha("grey", .6), lwd = 3)
lines(tmp$month[!lim1][1:H],
      (p$fcst$death[,1]),
      col = alpha("orchid", .6), lwd = 3)
lines(tmp$month[!lim1][1:H],
      tmp$death[!lim1][1:H],
      col = alpha("mediumseagreen", 1), lwd = 3)

AIDS Deaths

Code
ts.union(death = aids_ts[,2],
         diag = aids_ts[,1],
         deathd = diff(diff(log(aids_ts[,2]), lag = 12)),
         diagd = diff(diff(log(aids_ts[,1]), lag = 12))) -> aids_ts2

lim <- 14:nrow(aids_ts)
lagz <- VARselect(aids_ts2[lim,4:3], lag.max = 36)
# lagz$selection

reg <- VAR(aids_ts2[lim,4:3], p = 19, type = "both")
reg <- restrict(reg, "ser")

p <- predict(reg, n.ahead = H)

lim2 <- tmp$month[14:nrow(tmp)] < ymd("1994-05-01")
plot(tmp$month[14:nrow(tmp)][lim2],
     diff(diff(log(tmp$diag), lag = 12))[lim2],
     type = "n",
     xlab = "Month", ylab = "Percent Change in Quantity",
     xlim = c(min(tmp$month), ymd("1998-01-01")))
lines(tmp$month[14:nrow(tmp)][lim2],
      diff(diff(log(tmp$diag), lag = 12))[lim2],
      type = "l", col = "black", lwd = 3)
lines(tmp$month[14:nrow(tmp)][lim2],
      diff(diff(log(tmp$death), lag = 12))[lim2],
      col = "dodgerblue", lwd = 3)

lines(tmp$month[!lim1][1:H],
      p$fcst$death[,1], col = "tomato", lwd = 3)
lines(tmp$month[!lim1][1:H],
      diff(diff(log(tmp$death[c(which(!lim1)[1:H][1] - 13:1, which(!lim1)[1:H])]), lag = 12)),
      col = "mediumseagreen", lwd = 3)

auto.arima(ts(aids_ts2[14:nrow(aids_ts2),"deathd"],
              start = c(1998, 2), frequency = 12)) -> reg2
lines(tmp$month[!lim1][1:H],
      predict(reg2, n.ahead = 24)$pred,
      col = "orchid", lwd = 3)

legend("topright", bty = "n", col = c("black", "dodgerblue"), lty = 1,
         legend = c("Diagnoses", "Deaths"), lwd = 2)

AIDS Deaths

Code
cbind(VAR = round(mape(diff(diff(log(tmp$death[c(which(!lim1)[1:H][1] - 13:1, which(!lim1)[1:H])]), lag = 12)), p$fcst$death[,1]), 4)*100, # VAR
      ARIMA = round(mape(diff(diff(log(tmp$death[c(which(!lim1)[1:H][1] - 13:1, which(!lim1)[1:H])]), lag = 12)), predict(reg2, n.ahead = 24)$pred), 4)*100) # ARIMA
       VAR  ARIMA
[1,] 97.43 112.75

Coefficient Dynamics

Coefficient Dynamics

Thus far, we have assumed that the relationship between \(Y\) and \(X\) remains constant over time.

However, this is not always the case.

  • \(\text{Weight} = f(\text{Height})\) (Link)
  • \(\text{Unemployment} = f(\text{Inflation})\) (Phillips Curve)

Phillips Curve

Phillips Curve

Coefficient Dynamics

At a high level, there are a few ways to test, and account, for structural breaks.

  • Chow Break Test (strucchange)
  • Rolling Regression (rollRegres, note the one s)

Of course, you might want to identify breaks, but you might also want to rule them out and demonstrate the stability of your model. Both of these can help with this as well.

Chow Break Test

\[\frac{\Big(\frac{RSS_p - (RSS_1 + RSS_2)}{k}\Big)}{\Big(\frac{RSS_1 + RSS_2}{N_1 + N_2 - 2k}\Big)}\]

  • \(RSS_p\): Residual Sum of Squares for the entire (pooled) regression
  • \(RSS_1\): Residual Sum of Squares for the first section of data
  • \(RSS_2\): Residual Sum of Squares for the second section of data
  • \(k\): Number of Regressors

Chow Break Test

Code
N <- 200
shocks <- rnorm(N)
x <- rnorm(N)
t1 <- 1:N
t2 <- c(rep(0, .4*N), 1:(.6*N))

y <- 3*x + t1 - t2 + shocks

plot(t1, y, type = "l")

Chow Break Test

Testing for a break:

Code
strucchange::sctest(y ~ x, type = "Chow", point = 80)

    Chow test

data:  y ~ x
F = 171.05, p-value < 2.2e-16

But what if you don’t know where exactly the break happens?

Chow Break Test

Code
v <- rep(0, 180-20)
for(i in 20:180){
  
  v[i-19] <- strucchange::sctest(y ~ x, type = "Chow", point = i)$statistic
}

Chow Break Test

Code
plot(20:180, v, type = "b", pch = 19)

Chow Break Test

So what?

  • This can help you choose your analysis sample.
  • This can inform you of breaks you might want to model.
    • Remember: break models, trend models, kink models, etc.

Rolling Regression

Rolling regression is another tool to help you diagnose breaks or general dynamics in your model.

There are two ways you might use rolling regression:

  • Shifting Window
  • Expanding Window

Rolling Regression

Code
# shifiting window:
r1 <- rollRegres::roll_regres(y ~ x + t1, width = 25L, do_downdates = TRUE)
# expanding window:
r2 <- rollRegres::roll_regres(y ~ x + t1, width = 25L, do_downdates = FALSE)
par(mfrow = c(1, 2))
plot(r1$coefs[,"t1"], type = "b", pch = 19, main = "Shifting Window", ylim = c(-.1, 1.1))
plot(r2$coefs[,"t1"], type = "b", pch = 19, main = "Expanding Window", ylim = c(-.1, 1.1))

Next Classes

Today:

Please Submit Project Data!!

11/23:

  • No Class

11/30:

  • Spatial Autocorrelation
  • Project Help

12/7:

  • Final Presentations